Multi-Modal Streaming 3D Object Detection

Abstract

Modern autonomous vehicles rely heavily on mechanical LiDARs for perception. Current perception methods generally require $360^\circ$ point clouds, collected sequentially as the LiDAR scans the azimuth and acquires consecutive wedge-shaped slices. The acquisition latency of a full scan ($\sim 100$ ms) may lead to outdated perception, which is detrimental to safe operation. Recent streaming perception works proposed directly processing LiDAR slices and compensating for the narrow field of view (FoV) of a slice by reusing features from preceding slices. These works, however, are all based on a single modality and require past information, which may be outdated. Meanwhile, images from high-frequency cameras can support streaming models, as they provide a larger FoV compared to a LiDAR slice. However, this difference in FoV complicates sensor fusion. We propose an innovative camera-LiDAR streaming 3D object detection framework that uses camera images instead of past LiDAR slices to provide an up-to-date, dense, and wide context for streaming perception. The method outperforms prior streaming models and powerful full-scan baselines on the challenging NuScenes benchmark in accuracy and end-to-end runtime. Our method is shown to be robust to missing camera images, narrow LiDAR slices, and small camera-LiDAR miscalibration. Project website: mmstream.github.io
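
To make the slice-based acquisition described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It partitions a point cloud into equal wedge-shaped azimuth slices and estimates the acquisition time of one slice. The function names, the equal-width wedges, and the constant rotation speed are assumptions made for illustration; only the roughly 100 ms full-scan latency comes from the abstract.

```python
import math

# Approximate full-scan acquisition latency quoted in the abstract.
FULL_SCAN_MS = 100.0


def split_into_slices(points, num_slices):
    """Group (x, y, z) points into equal wedge-shaped azimuth slices.

    Hypothetical helper: assumes the sensor sits at the origin and that
    wedges are equally wide, which the abstract does not specify.
    """
    slices = [[] for _ in range(num_slices)]
    for x, y, z in points:
        # atan2 returns an angle in [-pi, pi]; map it to [0, 1].
        frac = (math.atan2(y, x) + math.pi) / (2.0 * math.pi)
        # Clamp so an azimuth of exactly +pi falls in the last wedge.
        k = min(int(frac * num_slices), num_slices - 1)
        slices[k].append((x, y, z))
    return slices


def slice_latency_ms(num_slices, full_scan_ms=FULL_SCAN_MS):
    """Acquisition time of one slice, assuming constant rotation speed."""
    return full_scan_ms / num_slices


points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
slices = split_into_slices(points, 4)
per_slice_ms = slice_latency_ms(4)
```

Under these assumptions, a detector that consumes one of four slices waits about a quarter of the full-scan time for its input, which is the latency advantage that motivates streaming detection.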

Similar resources

Improving 3D perception for Object Detection, Classification and Localization using Fused Multi-modal Sensors

Object perception in 3-D is a highly challenging problem in computer vision. The major concern in these tasks involves object occlusion, different object poses, appearance and limited perception of the environment by individual sensors in terms of range measurements. In this particular project, our goal is improving 3D perception of the environment by using fusion from lidars and cameras with f...

Object detection in multi-modal images using genetic programming

In this paper, we learn to discover composite operators and features that are synthesized from combinations of primitive image processing operations for object detection. Our approach is based on genetic programming (GP). The motivation for using GP-based learning is that we hope to automate the design of object detection system by automatically synthesizing object detection procedures from pri...

A Streaming Object Oriented Implementation of the Modal Distribution

The Modal distribution is a time-frequency distribution specifically designed to model the quasi-harmonic, multi-sinusoidal, nature of music signals and belongs to the Cohen general class of time-frequency distributions. A streaming, object-oriented implementation of the Modal distribution is presented which forms the basis for designing other members of the Cohen class. Implementation of this ...

Multi-modal Tracking for Object based SLAM

We present an on-line 3D visual object tracking framework for monocular cameras by incorporating spatial knowledge and uncertainty from semantic mapping along with high frequency measurements from visual odometry. Using a combination of vision and odometry that are tightly integrated we can increase the overall performance of object based tracking for semantic mapping. We present a framework fo...

Journal

Journal title: IEEE Robotics and Automation Letters

Year: 2023

ISSN: 2377-3766

DOI: https://doi.org/10.1109/lra.2023.3303696